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1 This application is submitted in the name of the following inventor(s) : 

2 

3 Inventor 

4 Blake LEWIS 

5 John EDWARDS 

6 Srinivasan VISWANATHAN 
7 

8 The assignee is Network Appliance. Inc., a California corporation 

9 having an office at 495 East Java Drive, Sunnyvale, CA 94089. 
10 

H Title of the Invention 

12 

12 Instant Snapshot 

14 

15 Background of the Invention 

16 

17 1. Field of Invention 

1 8 This invention relates to data storage systems. 



Citizenship Resi dence Citv and State 

USA Palo Alto, California 

USA Sunnyvale, California 

India Fremont, California 
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I This invention relates to data storage systems. 
2 

3 2. Related Art 
4 

5 Snapshots of a file system capture the contents of the files and directories in 

6 a file system at a particular point in time. Such snapshots have several uses. They allow 

7 the users of the file system to recover earlier versions of a file following an unintended 

8 deletion or modification. The contents of the snapshot can be copied to tape to provide a 

9 backup copy of the file system, and it can be copied to another file server and used as a 
10 replica. File systems, including the WAFL (Write Anywhere File Layout) file system, 

I I include a copy-on- write snapshot mechanism. Snapshot block ownership is recorded by 

12 updating the block's entry in the block map file, a bitmap associated with the vacancy of 

13 blocks. 
14 

1 5 One problem with the prior art of creating snapshots is that the requirement 

16 for additional file system metadata in the active filesystem to keep track of which blocks 

17 snapshots occupy. This metadata requires 4 bytes per 4-KB file system block, i.e., 

18 l/1024 th of the file system. These methods are inefficient both in their use of storage 
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I space and in the time needed to create the snapshots. 
2 

3 A second problem with earlier snapshot implementations, was the time 

4 consuming steps of writing out a description of the snapshot state on creation and 

5 removing it on deletion. 
6 

7 A third problem with earlier copy-on-write mechanisms, was the required 

8 steps consumed a considerable amount of time and file system space. For example, some 

9 systems, such as those supplied with DCE/DFS include a copy-on-write mechanism for 
10 creating snapshots (called "clones"). The copy-on-write mechanism was used to record 

II which blocks each clone occupied. Such systems require a new copy of the inode file 

12 and the indirect blocks for all files and directories are created when updating all of the 

13 original inodes. 
14 

15 Accordingly, it would be advantageous to provide an improved technique 

16 for more quickly and efficiently capturing the contents of the files and directories in the 

17 file system at a particular point in time. This is achieved in an embodiment of the 

1 8 invention that is not subj ect to the drawbacks of the related art. 
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I Summary of the Invention 

2 

3 The invention provides an improved method and apparatus for creating a 

4 snapshot of a file system. 
5 

6 In a first aspect of the invention, a "copy-on-write" mechanism is used. An 

7 effective snapshot mechanism must be efficient both in its use of storage space and in the 

8 time needed to create it because file systems are often large. The snapshot uses the same 

9 blocks as the active file system until the active file system is modified. Whenever a 
10 modification occurs, the modified data is copied to a new block and the old data is saved 

II (henceforth called "copy-on-write"). In this way, the snapshot only uses space where it 

12 differs from the active file system, and the amount of work required to create the 

13 snapshot initially is small. 
14 

15 In a second aspect of the invention, a record of which blocks are being used 

16 by the snapshot is included in the snapshot itself, allowing effectively instantaneous 

17 snapshot creation and deletion. 
18 
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1 In a third aspect of the invention, the state of the active file system is 

2 described by a set of metafiles; in particular, a bitmap (henceforth the "active map") 

3 describes which blocks are free and which are in use by the active file system. The inode 

4 file describes which blocks are used by each file, including the metafiles. The inode file 

5 itself is described by a special root inode, also known as the "fsinfo block. This copy of 

6 the root inode becomes the root of the snapshot. The root inode captures all required 

7 states for creating the snapshot such as the location of all files and directories in the file 

8 system, it. During subsequent updates of the active file system, the system consults the 

9 bitmap included in the snapshot (the "snapmap") to determine whether a block is free for 

10 reuse or belongs to a snapshot. This mechanism allows the active file system to keep 

1 1 track of which blocks each snapshot uses without recording any additional bookkeeping 

12 information in the file system. 
13 

14 In a fourth aspect of the invention, a snapshot can also be deleted 

1 5 instantaneously simply by discarding its root inode. Further bookkeeping is not required, 

16 because the snapshot includes it's own description. 
17 

18 In a fifth aspect of the invention, the performance overhead associated with 
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1 the search for free blocks is reduced by the inclusion of a summary file. The summary 

2 file identifies blocks that are used by at least one snapshot; it is the logical OR of all the 

3 snapmap files. The write allocation code decides whether a block is free by examining 

4 the active map and the summary file. The active map indicates whether the block is 

5 currently in use in the active file system. The summary file indicates whether the block is 

6 used by any snapshot. 
7 

8 In a sixth aspect of the invention, the summary file is updated in the 

9 background after the creation or deletion of a snapshot. This occurs concurrently with 

10 other file system operations. Two bits are stored in the file system "fsinfo block" for 

11 each snapshot. These two bits indicate whether the summary file needs to be updated 

12 using the snapshot's snapmap information as a consequence of its creation or deletion. 

13 When a block is freed in the active file system, the corresponding block of the summary 

14 file is updated with the snapmap from the most recently created snapshot, if this has not 

15 already been done. An in-core bit map records the completed updates to avoid repeating 

16 them unnecessarily. This ensures that the combination of the active bitmap and the 

17 summary file will consistently identify all blocks that are currently in use. Additionally, 

18 the summary file is updated to reflect the effect of any recent snapshot deletions when 
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1 freeing a block in the active file system. This allows reuse of blocks that are now entirely 

2 free. After updating the summary file following a snapshot creation or deletion, the 

3 corresponding bit in the fsinfo block is adjusted. 
4 

5 In a seventh aspect of the invention, the algorithm for deleting a snapshot 

6 involves examining the snapmaps of the deleted snapshot and the snapmaps of the next 

7 oldest and next youngest snapshot. A block that was used by the deleted snapshot but is 

8 not used by its neighbors can be marked free in the summary file, as no remaining 

9 snapshot is using it. However, these freed blocks cannot be reused immediately, as the 

10 snapmap of the deleted snapshot must be preserved until summary updating is complete. 

1 1 During a snapdelete free blocks are found by using the logical OR of the active bitmap, 

12 the summary file, and the snapmaps of all snapshots for which post-deletion updating is 

13 in progress. In other words, the snapmap of the deleted snapshot protects the snapshot 

14 from reuse until it is no longer needed for updating. 
15 

16 In the preferred embodiment, the invention is operative on WAFL file 

17 system. However, it is still possible for the invention to be applied to any computer data 

18 storage system such as a database system or a store and forward system such as cache or 
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1 RAM if the data is kept for a limited period of time. 

2 

3 Brief Description of the Drawings 

4 

5 Figure 1 shows a block diagram of a system for an instant snapshot. 

6 

7 Figure 2 shows a block diagram of an instant snapshot. 

8 

9 Figure 3 shows a flow diagram of a method for creating a snapshot. 

10 

11 Incorporated Disclosures 

12 

13 The inventions described herein can be used in conjunction with inventions 

14 described in the following applications: 
15 

16 • U.S. Patent Application Serial No. , Express Mail Mailing No. EL 

17 52478 1089US, filed August 18, 2000, in the name of Blake LEWIS, attorney docket 

18 number 103.1033.01, titled "Reserving File System Blocks" 
19 
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1 • U.S. Patent Application Serial No. , Express Mail Mailing No. 

2 EL524780242US, filed August 18, 2000, in the name of Rajesh SUNDARAM, 

3 attorney docket number 103.1034.01, titled "Dynamic Data Storage" 
4 

5 • U.S. Patent Application Serial No. , Express Mail Mailing No. 

6 EL524780256US, filed August 18, 2000, in the name of Ray CHEN, attorney docket 

7 number 103.1047.01, titled "manipulation of Zombie Files and Evil-Twin Files" 

8 

9 • Application Serial Number , in the names of Scott SCHOENTHAL, 

1 0 Express Mailing Number EL52478 1 075US, titled "Persistent and Reliable Delivery of 

1 1 Event Messages", assigned to the same assignee, attorney docket number 

12 103.1048.01, and all pending cases claiming the priority thereof. 

13 and 

14 • Application Serial Number , in the names of Douglas P. DOUCETTE, 

1 5 Express Mailing Number EL52478 1 092US, titled "Improved Space Allocation in a 

16 Write Anywhere File System", assigned to the same assignee, attorney docket number 

17 1 03 . 1 045 .0 1 , and all pending cases claiming the priority thereof. 
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1 

2 Detailed Description of the Preferred Embodiment 

3 

4 In the following description, a preferred embodiment of the invention is 

5 described with regard to preferred process steps and data structures. However, those 

6 skilled in the art would recognize, after perusal of this application, that embodiments of 

7 the invention might be implemented using a variety of other techniques without undue 

8 experimentation or further invention, and that such other techniques would be within the 

9 scope and spirit of the invention. 
10 Lexicography 

11 

12 As used herein, use of the following terms refer or relate to aspects of the 

13 invention as described below. The general meaning of these terms is intended to be 

14 illustory and in no way limiting. 
15 

16 • Inode - In general, the term "mode" refers to data structures that include information 

17 about files in Unix and other file systems. Each file has an inode and is identified by 

18 an inode number (i-number) in the file system where it resides. Inodes provide 
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1 important information on files such as user and group ownership, access mode (read, 

2 write, execute permissions) and type. An inode points to the file blocks or indirect 

3 blocks of the file it represents. 
4 

5 • Sector - In general, the term "sector" refers to a physical section of a disk drive 

6 including a collection of bytes, such as 512 bytes. 
7 

8 • Data Storage Block - In general, the phrase "data storage block" refers to specific 

9 allocation areas on a hard disk. The allocation area is a collection of sectors, such as 
10 8 sectors or 4,096 bytes, commonly called 4K bytes or 4-KB. 

11 

12 • File Block - In general, the phrase "file block" refers to a standard size block of data 

13 including some or all of the data in a file. In the preferred embodiment, the file block 

14 is the same size as a data storage block. 
15 

16 • fsinfo (File System Information Block) - In general, the phrase "file system 

17 information block" refers to one or more copies of a block known as the "fsinfo 

18 block". These blocks are located at fixed locations on the disks. The fsinfo block 
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1 includes data about the volume including the size of the volume, volume level 

2 options, language and more. 
3 

4 • WAFL (Write Anywhere File Layout) - In general, the term "WAFL" refers to a 

5 high level structure for a file system. Pointers are used for locating data. All the data 

6 is included in files. These files can be written anywhere on the disk in chunks of file 

7 blocks placed in data storage blocks. 
8 

9 • Volume In general, the term "volume" refers to a single file system. The file system 

10 may be composed of a collection of disk drives. 
11 

12 • Consistency Point (CP) - In general, the term "CP" refers to a time that a file system 

13 reaches a consistent state. When this state is reached, all the files have been written to 

14 all the blocks and are safely on disk and the one or more copies of redundant fsinfo 

15 blocks get written out. If the system crashes before the fsinfo blocks go out, all other 

16 changes are lost and the system reverts back to the last CP. The file system advances 

17 atomically from one CP to the next. 
18 
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1 • Consistent State - In general, the phrase "consistent state" refers to the system 

2 configuration of files in blocks after the CP is reached. 
3 

4 • Range - In general, the term "range" refers to a group of blocks, such as 1,024 blocks. 
5 

6 • Active file system - In general, the phrase "active file system" refers to the current 

7 file system arrived at with the most recent CP. In the preferred embodiment, the 

8 active file system includes the active map, the summary map and points to all 

9 snapshots and other data storage blocks through a hierarchy of inodes, indirect data 
10 storage blocks and more. 

11 

12 • Active map - In general, the phrase "active map" refers to a to a file including a 

13 bitmap associated with the vacancy of blocks of the active file system. 
14 

15 • Snapshot - In general, the term "snapshot" refers to a copy of the file system. The 

16 snapshot diverges from the active file system over time as the active file system is 

17 modified. A snapshot can be used to return the file system to a particular CP 

1 8 (consistency point) . 
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1 • Snapmap - In general, the term "snapmap" refers to a file including a bitmap 

2 associated with the vacancy of blocks of a snapshot. The active map diverges from a 

3 snapmap over time as the blocks used by the active file system change during 

4 consistency points. 
5 

6 • Summary map - In general, the term "summary map" refers to a file including an 

7 IOR (inclusive OR) bitmap of all the snapmaps. 

8 

9 • Space map - In general, the term "space map" refers to a file including an array of 
10 numbers which describe the number of storage blocks used in an allocation area. 
11 

12 • Blockmap - In general, the term "blockmap" refers to a map describing the status of 

1 3 the blocks in the file system. 
14 

15 • Snapdelete - In general, the term "snapdelete" refers to an operation that removes a 

16 particular snapshot from the file system. This command can allow a storage block to 

17 be freed for reallocation provided no other snapshot or the active file system uses the 

18 storage block. 
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1 • Snapcreate - In general, the term "snapcreate" refers to the operation of retaining a 

2 consistency point and preserving it as a snapshot. 
3 

4 As described herein, the scope and spirit of the invention is not limited to 

5 any of the definitions or specific examples shown therein, but is intended to include the 

6 most general concepts embodied by these and other terms. 
7 

8 System Elements 
9 

10 Figure 1 shows a block diagram of a system for an instant snapshot. 
11 

12 The root block 100 includes the inode of the inode file 105 plus other 

13 information regarding the active file system 110, the active map 115, previous active file 

14 systems known as snapshots 120, 125, 130 and 135 and their respective snapmaps 140, 

15 145, 150 and 155. 
16 

17 The active map 115 of the active file system 110 is a bitmap associated 

18 with the vacancy of blocks for the active file system 110. The respective snapmaps 140, 
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1 145, 150 and 155 are active maps can be associated with particular snapshots 120, 125, 

2 130 and 135 and an inclusive OR summary map 160 of the snapmaps 140, 145, 150 and 

3 155. Also shown are other blocks 115 including double indirect blocks 130 and 132, 

4 indirect blocks 165, 166 and 167 and data blocks 170, 171, 172 and 173. Finally, Figure 

5 1 shows the spacemap 180 including a collection of spacemap blocks of numbers 181, 

6 182, 183, 184 and 190. 
7 

8 The root block 100 includes a collection of pointers that are written to the 

9 file system when the system has reached a new CP (consistency point). The pointers are 

10 aimed at a set of indirect (or triple indirect, or double indirect) inode blocks (not shown) 

11 or directly to the inode file 105 consisting of a set of blocks known as inode blocks 191, 

12 192, 193, 194 and 195. 
13 

14 The number of total blocks determines the number of indirect layers of 

15 blocks in the file system. The root block 100 includes a standard quantity of data, such as 

16 128 bytes. 64 of these 128 bytes describe file size and other properties; the remaining 64 

17 bytes are a collection of pointers to the inode blocks 191, 192, 193, 194 and 195 in the 

18 inode file 105. Each pointer in the preferred embodiment is made of 4 bytes. Thus, there 
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1 are approximately 16 pointer entries in the root block 100 aimed at 16 corresponding 

2 inode blocks of the inode file 105 each including 4K bytes. If there are more than 16 

3 inode blocks, indirect inode blocks are used. 
4 

5 In a preferred embodiment, file blocks are 4096 bytes and inodes are 128 

6 bytes. It follows that each block of the inode file contains 32 (i.e. 4,096/128) separate 

7 inodes that point to other blocks 1 15 in the active file system. 
8 

9 Inode block 193 in the inode file 105 points to a set of blocks (1, 2, 3, P) 

10 called the active map 115. Each block in the active map 1 15 is a bitmap where each bit 

11 corresponds to a block in the entire volume. A "1" in a particular position in the bitmap 

12 correlates with a particular allocated block in the active file system 110. Conversely, a 

13 "0" correlates to the particular block being unused by the active file system 110. Since 

14 each block in the active map 115 can describe up to 32K blocks or 128 MB, 8 blocks are 

1 5 required per GB, 8K blocks per TB. 
16 

17 Another inode block in the inode file 105 is inode block N 212. This block 

18 includes a set of pointers to a collection of snapshots 120, 125, 130 and 135 of the 
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1 volume. Each snapshot includes all the information of a root block and is equivalent to 

2 an older root block from a previous active file system. The snapshot 120 may be created 

3 at any past CP. Regardless when the snapshot is created, the snapshot is an exact copy 

4 of the active file system at that time. The newest snapshot 120 includes a collection of 

5 pointers that are aimed directly or indirectly to the same inode file 105 as the root block 

6 100 of the active file system 110. As the active file system 110 changes (generally from 

7 writing files, deleting files, changing attributes of files, renaming file, modifying their 

8 contents and related activities), the active file system and snapshot will diverge over time. 

9 Given the slow rate of divergence of an active file system from a snapshot, any two 

10 snapshots will share many of the same blocks. The newest snapshot 120 is associated 

11 with snapmap 140. Snapmap 140 is a bit map that is initially identical to the active map 

12 115. The older snapshots 125, 130 and 194 have a corresponding collection of snapmaps 

13 145, 150 and 155. Like the active map 115, these snapmaps 145, 150 and 155 include a 

14 set of blocks including bitmaps that correspond to allocated and free blocks for the 

15 particular CP when the particular snapmaps 145, 150 and 155 were created. Any active 

16 file system may have a structure that includes pointers to one or more snapshots. 

17 Snapshots are identical to the active file system when they are created. It follows that 

18 snapshots contain pointers to older snapshots. There can be a large number of previous 
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1 snapshots in any active file system or snapshot. In the event that there are no snapshot, 

2 there will be no pointers in the active file system. 
3 

4 Blocks not used in the active file system 110 are not necessarily available 

5 for allocation or reallocation because the blocks may be used by snapshots. Blocks used 

6 by snapshots are freed by removing a snapshot using the snapdelete command. When a 

7 snapshot is deleted any block used only by that snapshot and not by other snapshots nor 

8 by the active file system becomes free for reuse by WAFL. If no other snapshot or active 

9 files uses the block, then the block can be freed and written over during the next copy on- 

10 wrote-execution by WAFL. The system can relatively efficiently determine whether a 

11 block can be removed using the "nearest neighbor rule". If the previous and next 

12 snapshot do not allocate a particular block in their respective snapmaps, then the block 

13 can be freed for reuse by WAFL. For WAFL to find free space to write new data or 

14 metadata, it could search the active map 115 and the snapmaps (140, 145, 150 and 155) 

15 of the snapshots (120, 125, 130 and 135) to find blocks that are totally unused. This 

16 would be very inefficient; thus it is preferable to use the active map and the summary 

17 map as described below 
18 

Page 19 

Express Mailing No. EL 524 780 239 US 



103.1035.01 

1 A summary map 160 is created by using an IOR (inclusive OR) operation 

2 139 on the snapmaps 140, 145, 150 and 155. Like the active map 115 and the snapmaps 

3 140, 145, 150 and 155, the summary map 160 is a file whose data blocks (1, 2, 3, ...Q) 

4 contained a bit map. Each bit in each block of the summary map. describes the allocation 

5 status of one block in the system with "1" being allocated and "0" being free. The 

6 summary map 160 describes the allocated and free blocks of the entire volume from all 

7 the snapshots 120, 125, 130 and 135 combined. The use of the summary file 160 is to 

8 avoid overwriting blocks in use by snapshots. 
9 

10 An IOR operation on sets of blocks (such as 1,024 blocks) of the active 

11 map 115 and the summary map 160 produces a spacemap 180. Unlike the active map 

12 115 and the summary map 160, which are a set of blocks containing bitmaps, the 

13 spacemap 180 is a set of blocks including 181, 182, 183, 184, and 190 containing arrays 

14 of binary numbers. The binary numbers in the array represent the addition of all the 

15 vacant blocks in a region containing a fixed number of blocks, such as 1,024 blocks. The 

16 array of binary numbers in the single spacemap block 181 represents the allocation of all 

17 blocks for all snapshots and the active file system in one range of 1,024 blocks. Each of 

18 the binary numbers 181, 182, 183, 184, and 190 in the array are a fixed length. In a 
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1 preferred embodiment, the binary numbers are 16 bit numbers, although only 10 bits are 

2 used. 

3 

4 In a preferred embodiment, the large spacemap array binary number 182 

5 (0000001 1 1 1 1 1 1 1 10=1,021 in decimal units) tells the file system that the corresponding 

6 range is relatively full. In such embodiments, the largest binary number 

7 00001111111111 (1,023 in decimal) represents a range containing at most one empty.. 

8 The small binary number 184 (0000000000001 1 10-13 in decimal units) instructs the file 

9 system that the related range is relatively empty. The spacemap 180 is thus a 

10 representation in a very compact form of the allocation of all the blocks in the volume 

11 broken into 1,024 block sections. Each 16 bit number in the array of the spacemap 180 

12 corresponds to the allocations of blocks in the range containing 1,024 blocks or about 4 

13 MB. Each spacemap block 180 has about 2,000 binary numbers in the array and they 

14 describe the allocation status for 8 GB. Unlike the summary map 120, the spacemap 

15 block 1 80 needs to be determined whenever a file needs to be written. 
16 

17 Figure 2 shows a block diagram of an instant snapshot. 

18 
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1 The old root block 200 of snapshot #1 201 includes the inode of the inode 

2 file 202 plus other information regarding the previous active file system known as 

3 snapshot #1 201, the snap map 205, earlier active file systems known as snapshot #2 210, 

4 snapshot #3 215 and snapshot #4 220, and their respective snapmaps 225, 230 and 235. 
5 

6 The snapmap 205 of the previous active file system, snapshot #1 201, is a 

7 bitmap associated with the vacancy of blocks for snapshot #1 201. The respective 

8 snapmaps 225, 230 and 235 are earlier active maps that can be associated with particular 

9 snapshots 210, 215 and 220 and an inclusive OR summary map 245 of the snapmaps 225, 

10 230 and 235. Also shown are other blocks 211 including double indirect blocks 240 and 

11 332, indirect blocks 250, 251 and 252 and data blocks 260, 262, 263 and 264. Finally, 

12 Figure 2 shows the spacemap 270 of snapshot #1 201 including a collection of spacemap 

13 blocks of binary numbers 272, 273, 274, 275 and 276. 
14 

15 The old root block 200 includes a collection of pointers that are written to 

16 the previous active file system when the system had reached the previous CP. The 

17 pointers are aimed at a set of indirect (or triple indirect, or double indirect) inode blocks 

18 (not shown) or directly to the inode file 202 consisting of a set of blocks known as inode 
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1 blocks 281, 282, 283, 284 and 285. 

2 

3 An inode block 281 in the inode file 202 points to other blocks 328 in the 

4 old root block 201 starting with double indirect blocks 240 and 332 (there could also be 

5 triple indirect blocks). The double indirect blocks 240 and 332 include pointers to 

6 indirect blocks 250, 251 and 252. The indirect blocks 250, 251 and 252 include pointers 

7 that are directed to data leaf blocks 260, 262, 263 and 264 of the active file system 20 1 . 
8 

9 Inode block 283 in the inode file 202 points to a set of blocks (1, 2, 3, P) 

10 called the snap map 205. Each block in the snap map 205 is a bitmap where each bit 

1 1 corresponds to a block in the entire volume. A "1 " in a particular position in the bitmap 

12 correlates with a particular allocated block in the active file system 201. Conversely, a 

13 "0" correlates to the particular block being free for allocation in the old root block 201. 

14 Each block in the snap map 205 can describe up to 32K blocks or 128 MB. 
15 

16 Inode file 202 also includes inode block N 285. This block includes a set of 

17 pointers to a collection of earlier snapshots, snapshot #2 210, shapshot #3 215 and 

18 snapshot #4 220 of the volume. Each snapshot includes all the information of a root 
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1 block and is equivalent to an older root block from a previous active file system. 
2 

3 Snapshot #1 201 also includes an old summary map 245 and old spacemap 

4 blocks 270. Although these blocks of data are included in snapshot #1 201 and previous 

5 snapshots, in a preferred embodiment, this data is not used by the active file system of 

6 figure 2. 
7 

8 Method of Use 
9 

10 Figure 3 shows a flow diagram of a method for using a system as shown in 

11 figure 1. 
12 

13 A method 300 is performed by the file system 100. Although the method 

14 400 is described serially, the steps of the method 300 can be performed by separate 

15 elements in conjunction or in parallel, whether asynchronously, in a pipelined manner, or 

16 otherwise. There is no particular requirement that the method 300 be performed in the 

17 same order in which this description lists the steps, except where so indicated. 
18 
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1 At a flow point 305, the file system 100 is ready to perform a method 300. 

2 

3 At a step 3 10, a user will request a snapshot of the file system 100. 

4 

5 At a step 315, a timer associated with the file system 100 initiates the 

6 creation of a new snapshot. 
7 

8 At a step 320, the file system 100 receives a request to make a snapshot. 

9 

10 At a step 325, the file system 100 creates a new file. 

11 

12 At a step 330, the root node of the new file points to the root node of the 

13 current active file system. 
14 

15 At a step 335, the file system 100 makes the file read only. 

16 

17 At a step 340, the file system 100 updates the new summary map by using 

18 an inclusive OR of the most recent snapmap and the existing summary file. This step 
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1 must be done before any blocks are freed in the corresponding active map block. If 

2 multiple snapshots are created such that the processing overlaps in time, the update in 

3 step 340 need only be done for the most recently created snapshot. 
4 

5 At a flow point 345, the snapshot create and the summary file update is 

6 completed and the snapshot creation is done. 
7 

8 An analogous method may be performed for snapshot delete. 

9 

1 0 Alternative Embodiments 
11 

12 Although preferred embodiments are disclosed herein, many 

13 variations are possible which remain within the concept, scope, and spirit of the 

14 invention, and these variations would become clear to those skilled in the art after perusal 

15 of this application. 
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1 Claims 

2 

3 1 A method for capturing the contents of the files and directories in a 

4 file system, said file system comprising a set of storage blocks in a mass storage system 

5 including steps for 

6 recording an active map in said file system of said storage blocks not 

7 available for writing data; 

8 recording a consistency point in said file system including a consistent 

9 version of said file system at a previous time, said consistency point including a copy of 

10 said active map at said previous time; 

11 refraining from writing data to storage blocks in response to said active 

12 map; and 

13 at least one of said copy of said active map at said previous time. 
14 

15 2. A method as in claim 1, wherein said step for refraining includes 

16 determining a logical union of said storage blocks used by one or more of said copies of 

17 said active map at said previous time. 
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1 

2 3. A method as in claim 1, wherein said step for refraining includes 

3 determining a subset of said storage blocks used by one or more of said copies of said 

4 active map at said previous time. 

5 



6 4. A method as in claim 1, wherein said file system is a WAFL file 

7 system. 

8 

9 5. A method as in claim 1, wherein said active map at said previous 

10 time is a snapmap. 

11 

12 6. A method as in claim 1 and 5, including removing a root inode of 

13 said snapmap using a snap delete. 

14 

15 7 A method as in claim 6, including steps for determining not to write 

16 to a block after said step, provided the previous or next snapmap uses said block. 
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1 



2 8. A method as in claim 1, including a copy-on- write mechanism for 

3 copying modified data to a new block and saving old data in a current data block. 

4 

5 9. A method for capturing the contents of the files and directories in a 

6 file system, said file system comprising a set of storage blocks in a mass storage system 

7 including 

8 recording a consistency point in said file system including a consistent 

9 version of said file system at a previous time, said consistency point including a copy of 

10 said active map at said previous time; and 

1 1 returning to said file system at a previous time using said consistent version 

12 of said file system following an unintended deletion or modification. 
13 

14 10. A method as in claim 9, wherein said consistent version includes a 

1 5 pointer to a previous root block of the inode file. 
16 

17 11. A method as in claim 9, wherein said file system is a WAFL file 

18 system. 
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1 



2 12. A method as in claim 9, wherein said active map at said previous 

3 time is a snapmap. 

4 13. A method as in claim 9 and 12, including a snapdelete method for 

5 removing a root inode of said snapmap. 

6 

7 14. A method as in claim 13, including steps for determining not to write 

8 to a block after said snapdelete method provided a previous or next snapmap uses said 

9 block. 



10 



11 15. A method as in claim 9, including a copy-on-write mechanism for 

12 copying modified data to a new block and saving old data in a current data block. 

13 

14 16. A method for saving previous versions of an active file system 

15 including the contents of the files and directories in a file system, said file system 

16 comprising a set of storage blocks in a mass storage system including steps for 

17 writing modified files to unused data blocks; 
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1 keeping previous files in currently occupied blocks; and 

2 recording a consistency point in said file system including a consistent 

3 version of said file system at a previous time, said consistency point including a copy of 

4 said active map at said previous time; 
5 



6 17. A method as in claim 16, including retrieving said file system at a 

7 previous time using a pointer. 
8 

9 18. A method as in claim 16, wherein said pointer corresponds to a root 

10 block of said file system at a previous time. 
11 

12 19. A method as in claim 16, wherein said file system is a WAFL file 

13 system. 

14 

15 20. A method as in claim 16, wherein said active map at said previous 

16 time is a snapmap. 

17 

18 21. A method as in claim 16 and 20, including a snapdelete method for 
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1 removing a root inode of said snapmap. 

2 

3 22. A method as in claim 20, including not writing to a block after said 

4 snapdelete method provided a previous or next snapmap uses said block. 

5 

6 23. A method as in claim 16, including a copy-on-write mechanism for 

7 copying modified data to a new block and saving old data in a current data block. 
8 
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1 Abstract of the Disclosure 
2 

3 The invention provides an improved method and apparatus for creating a 

4 snapshot of a file system. In a first aspect of the invention, a "copy-on-write" mechanism 

5 is used. An effective snapshot mechanism must be efficient both in its use of storage 

6 space and in the time needed to create it because file systems are often large. The 

7 snapshot uses the same blocks as the active file system until the active file system is 

8 modified. Whenever a modification occurs, the modified data is copied to a new block 

9 and the old data is saved (henceforth called "copy-on-write"). In this way, the snapshot 

10 only uses space where it differs from the active file system, and the amount of work 

1 1 required to create the snapshot is small. In a second aspect of the invention, a record of 

12 which blocks are being used by the snapshot is included in the snapshot itself, allowing 

13 effectively instantaneous snapshot creation and deletion. In a third aspect of the 

14 invention, the state of the active file system is described by a set of metafiles; in 

15 particular, a bitmap (henceforth the "active map") describes which blocks are free and 

16 which are in use. The inode file describes which blocks are used by each file, including 

17 the metafiles. The inode file itself is described by a special root inode, also known as the 

18 "fsinfo block". The system begins creating a new snapshot by making a copy of the root 
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1 inode. This copy of the root inode becomes the root of the snapshot. The root inode 

2 captures all required states for creating the snapshot such as the location of all files and 

3 directories in the file system, it. During subsequent updates of the active file system, the 

4 system consults the bitmap included in the snapshot (the "snapmap") to determine 

5 whether a block is free for reuse or belongs to the snapshot. This mechanism allows the 

6 active file system to keep track of which blocks each snapshot uses without recording any 

7 additional bookkeeping information in the file system. In a fourth aspect of the 

8 invention, a snapshot can also be deleted instantaneously simply by discarding its root 

9 inode. Further bookkeeping is not required, because the snapshot includes it's own 

10 description. In a fifth aspect of the invention, the performance overhead associated with 

1 1 the search for free blocks is reduced by the inclusion of a summary file. The summary 

12 file identifies blocks that are used by at least one snapshot; it is the logical OR of all the 

13 snapmap files. The write allocation code decides whether a block is free by examining 

14 the active map and the summary file. The active map indicates whether the block is 

15 currently in use in the active file system. The summary file indicates whether the block is 

16 used by any snapshot. In a sixth aspect of the invention, the summary file is updated in 

17 the background after the creation or deletion of a snapshot. This occurs concurrently 

18 with other file system operations. Two bits are stored in the file system "fsinfo block" 
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1 for each snapshot. These two bits indicate whether the summary file needs to be updated 

2 using the snapshot's snapmap information as a consequence of its creation or deletion. 

3 When a block is freed in the active file system, the corresponding block of the summary 

4 file is updated with the snapmap from the most recently created snapshot, if this has not 

5 already been done. An in-core bit map records the completed updates to avoid repeating 

6 them unnecessarily. This ensures that the combination of the active bitmap and the 

7 summary file will consistently identify all blocks that are currently in use. Additionally, 

8 the summary file is updated to reflect the effect of any recent snapshot deletions when 

9 freeing a block in the active file system. This allows reuse of blocks that are now entirely 

10 free. After updating the summary file following a snapshot creation or deletion, the 

1 1 corresponding bit in the fsinfo block is adjusted. In a seventh aspect of the invention, the 

12 algorithm for deleting a snapshot involves examining the snapmaps of the deleted 

13 snapshot and the snapmaps of the next oldest and next youngest snapshot. A block that 

14 was used by the deleted snapshot but is not used by its neighbors can be marked free in 

15 the summary file, as no remaining snapshot is using it. However, these freed blocks 

16 cannot be reused immediately, as the snapmap of the deleted snapshot must be preserved 

17 until summary updating is complete. During a snapdelete, free blocks are found by using 

18 the logical OR of the active bitmap, the summary file, and the snapmaps of all snapshots 
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1 for which post-deletion updating is in progress. In other words, the snapmap of the 

2 deleted snapshot protects the snapshot from reuse until it is no longer needed for 

3 updating. In the preferred embodiment, the invention is operative on WAFL file system. 

4 However, it is still possible for the invention to be applied to any computer data storage 

5 system such as a database system or a store and forward system such as cache or RAM if 

6 the data is kept for a limited period of time. 
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